Classification of general audio data for content-based retrieval
Authors
Abstract
In this paper, we address the problem of classification of continuous general audio data (GAD) for content-based retrieval, and describe a scheme that is able to classify audio segments into seven categories consisting of silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise. We studied a total of 143 classification features for their discrimination capability. Our study shows that cepstral-based features such as the Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC) provide better classification accuracy compared to temporal and spectral features. To minimize the classification errors near the boundaries of audio segments of different type in general audio data, a segmentation-pooling scheme is also proposed in this work. This scheme yields classification results that are consistent with human perception. Our classification system provides over 90% accuracy at a processing speed dozens of times faster than the playing rate. © 2000 Elsevier Science B.V. All rights reserved.
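The abstract does not describe how the segmentation-pooling scheme works internally. As a rough illustration only, the sketch below assumes it amounts to splitting the stream at detected segment boundaries and giving every frame in a segment the majority label of that segment, which suppresses isolated misclassifications near boundaries. The function name `pool_labels`, the label strings, and the boundary indices are hypothetical and are not taken from the paper.

```python
from collections import Counter


def pool_labels(frame_labels, boundaries):
    """Assign each segment the most common frame label inside it.

    frame_labels: per-frame class labels (hypothetical classifier output).
    boundaries:   sorted frame indices where a new segment starts
                  (assumed to come from a separate segmentation step).
    """
    n = len(frame_labels)
    # Segment edges: start of stream, each interior boundary, end of stream.
    edges = [0] + [b for b in boundaries if 0 < b < n] + [n]
    pooled = []
    for start, end in zip(edges, edges[1:]):
        if start == end:
            continue  # skip empty segments (e.g. duplicate boundaries)
        # Majority vote over the frame labels in this segment.
        winner = Counter(frame_labels[start:end]).most_common(1)[0][0]
        pooled.extend([winner] * (end - start))
    return pooled


frames = ["speech", "speech", "music", "speech",
          "music", "music", "speech", "music"]
print(pool_labels(frames, boundaries=[4]))
```

With one boundary at frame 4, the stray "music" frame in the first segment and the stray "speech" frame in the second are overruled by their segment majorities.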
Similar resources
A New Approach For Classification Of Generic Audio Data
The existing audio retrieval systems fall into one of two categories: single-domain systems that can accept data of only a single type (e.g. speech) or multiple-domain systems that offer content-based retrieval for multiple types of audio data. Since a single-domain system has limited applications, a multiple-domain system will be more useful. However, different types of audio data will have di...
Multifeature Audio Segmentation for Browsing and Annotation
Indexing and content-based retrieval are necessary to handle the large amounts of audio and multimedia data that are becoming available on the web and elsewhere. Since manual indexing using existing audio editors is extremely time consuming, a number of automatic content analysis systems have been proposed. Most of these systems rely on speech recognition techniques to create text indices. On the...
Intelligent Content-Based Audio Classification and Retrieval for Web Applications
Content-based technology has emerged from the development of multimedia signal processing and the widespread use of web applications. In this chapter, we discuss the issues involved in content-based audio classification and retrieval, including spoken document retrieval and music information retrieval. Further, along this direction, we conclude that the emerging audio ontology can be applied in fas...
Content-Based Audio Classification and Retrieval Using SVM Learning
In this paper, a method based on support vector machines (SVMs) is proposed for content-based audio classification and retrieval. Given a feature set, which in this work is composed of perceptual and cepstral features, optimal class boundaries between classes are learned from training data by using SVMs. Matches are ranked by using distances from boundaries. Experiments are presented to compare var...
Precision
As one of the key methods to extract content semantics and structure from audio, automatic audio classification, especially of speech and music, is valuable for content-based audio retrieval, video summary and retrieval, spoken document retrieval, etc. Because a hidden Markov model (HMM) can model an audio signal's temporal statistical properties well, a left-right discrete HMM is proposed to c...
Journal:
- Pattern Recognition Letters
Volume 22, Issue
Pages -
Publication date: 2001